Questions

  1. Which dog breeds are the most popular? Has this changed over the past several years?
  2. What attributes are most important for predicting dog breed rankings?
  3. How do the top 5 dog breeds score on those key attributes?

Data obtained from https://github.com/rfordatascience/tidytuesday/tree/master/data/2022/2022-02-01

Thomas Mock (2022). Tidy Tuesday: A weekly data project aimed at the R ecosystem. https://github.com/rfordatascience/tidytuesday.

Quick snapshot

Breed X2013.Rank X2014.Rank X2015.Rank X2016.Rank X2017.Rank X2018.Rank X2019.Rank X2020.Rank links Image LocalImage Affectionate.With.Family Good.With.Young.Children Good.With.Other.Dogs Shedding.Level Coat.Grooming.Frequency Drooling.Level Coat.Type Coat.Length Openness.To.Strangers Playfulness.Level Watchdog.Protective.Nature Adaptability.Level Trainability.Level Energy.Level Barking.Level Mental.Stimulation.Needs
Affenpinschers 143 144 136 149 147 148 152 163 https://www.akc.org/dog-breeds/affenpinscher/ https://www.akc.org/wp-content/uploads/2017/11/Affenpinscher-illustration.jpg img/Affenpinschers_new.png 3 3 3 3 3 1 Wiry Short 5 3 3 4 3 3 3 3
Akitas 45 46 46 46 47 47 47 48 https://www.akc.org/dog-breeds/akita/ https://s3.amazonaws.com/cdn-origin-etr.akc.org/wp-content/uploads/2017/11/06155318/Akita-illustration.jpg img/Akitas_new.png 3 3 1 3 3 1 Double Medium 2 3 5 3 3 4 2 3
Azawakhs NA NA NA NA NA NA NA 193 https://www.akc.org/dog-breeds/azawakh/ https://www.akc.org/wp-content/uploads/2017/11/Azawakh_illustration.bw_.1.jpg img/Azawakhs_new.png 3 3 3 2 2 1 Smooth Short 1 3 3 3 2 3 1 3
Barbets NA NA NA NA NA NA NA 145 https://www.akc.org/dog-breeds/barbet/ https://www.akc.org/wp-content/uploads/2017/11/Barbet-illustration.jpg img/Barbets_new.png 4 5 5 1 3 1 Curly Medium 3 3 3 3 4 3 3 3
Basenjis 85 86 87 88 84 87 88 86 https://www.akc.org/dog-breeds/basenji/ https://www.akc.org/wp-content/uploads/2017/11/Basenji-Illo-2.jpg img/Basenjis_new.png 3 3 3 2 1 1 Smooth Short 3 3 3 3 2 4 1 4
Beagles 4 5 5 5 6 6 7 7 https://www.akc.org/dog-breeds/beagle/ https://www.akc.org/wp-content/uploads/2017/11/Beagle-Illo-2.jpg img/Beagles_new.png 3 5 5 3 2 1 Smooth Short 3 4 2 4 3 4 4 4

Any missing values?
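
A quick way to answer this is to count the `NA`s in each column. The sketch below uses a tiny stand-in data frame (called `toy` here, a made-up name) with a few rank values copied from the snapshot above, so it runs without downloading anything. The same two calls should work on the full data.

```r
# Toy stand-in for the breed data (rank values copied from the snapshot above)
toy <- data.frame(
  Breed      = c("Akitas", "Azawakhs", "Barbets", "Beagles"),
  X2019.Rank = c(47, NA, NA, 7),
  X2020.Rank = c(48, 193, 145, 7)
)

# Number of missing values in each column
na_counts <- colSums(is.na(toy))
print(na_counts)

# Breeds with at least one missing value
incomplete_breeds <- toy$Breed[!complete.cases(toy)]
print(incomplete_breeds)
```

In this toy example only the 2019 rank has gaps, which fits breeds that enter the ranking later on.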

Which dog breeds were the favorite in 2020?

Which dog breeds are the favorites over the years?

Let’s look at the top 10 dog breeds since 2013.
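
To plot ranks over time, the wide `X2013.Rank` ... `X2020.Rank` columns first need to be reshaped into one row per breed and year. Here is a minimal sketch of that step, assuming `dplyr` and `tidyr` are installed. The `toy` data frame and the `top_n_breeds` cutoff are made up for illustration, and picking the top breeds by their 2020 rank is an assumption about what "top 10" means here.

```r
library(dplyr)
library(tidyr)

# Toy stand-in for the breed data (rank values copied from the snapshot above)
toy <- data.frame(
  Breed      = c("Akitas", "Azawakhs", "Basenjis", "Beagles"),
  X2013.Rank = c(45, NA, 85, 4),
  X2019.Rank = c(47, NA, 88, 7),
  X2020.Rank = c(48, 193, 86, 7)
)

top_n_breeds <- 2  # hypothetical cutoff; the post looks at the top 10

# Wide to long: one row per breed and year, with Year parsed from the column name
long <- toy %>%
  pivot_longer(
    cols            = matches("^X[0-9]{4}\\.Rank$"),
    names_to        = "Year",
    names_pattern   = "X([0-9]{4})\\.Rank",
    names_transform = list(Year = as.integer),
    values_to       = "Rank"
  )

# Assumption: "top" means the best (lowest) rank numbers in 2020
top_breeds <- toy %>%
  slice_min(X2020.Rank, n = top_n_breeds) %>%
  pull(Breed)

top_long <- long %>%
  filter(Breed %in% top_breeds) %>%
  arrange(Breed, Year)
print(top_long)
```

The long format is what a line chart wants: `Year` on the x-axis, `Rank` on the y-axis (reversed, so that rank 1 sits at the top), and one line per `Breed`.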

Ok, maybe we should look at another graph without the cute pictures?

Wow, what a comeback for the Pembroke Welsh Corgis!

What about the underdogs?

##  [1] "Affenpinschers" "Akitas"         "Azawakhs"       "Barbets"       
##  [5] "Basenjis"       "Beagles"        "Beaucerons"     "Bloodhounds"   
##  [9] "Boerboels"      "Borzois"        "Boxers"         "Briards"       
## [13] "Brittanys"      "Bulldogs"       "Bullmastiffs"   "Chihuahuas"    
## [17] "Chinooks"       "Collies"        "Dachshunds"     "Dalmatians"    
## [21] "Greyhounds"     "Harriers"       "Havanese"       "Keeshonden"    
## [25] "Komondorok"     "Kuvaszok"       "Leonbergers"    "Lowchen"       
## [29] "Maltese"        "Mastiffs"       "Newfoundlands"  "Otterhounds"   
## [33] "Papillons"      "Pekingese"      "Pointers"       "Pomeranians"   
## [37] "Poodles"        "Pugs"           "Pulik"          "Pumik"         
## [41] "Rottweilers"    "Salukis"        "Samoyeds"       "Schipperkes"   
## [45] "Sloughis"       "Vizslas"        "Weimaraners"    "Whippets"      
## [49] "Xoloitzcuintli"

What are the most important traits for predicting the overall ranking in 2020?

Work in progress - still figuring out the merge :)

Using random forest, we can see how important the different traits are for the 2020 ranking.

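library(dplyr)         # select(), matches()
library(randomForest)  # randomForest()

# Fit the random forest with default parameters, dropping the 2013-2019 ranks and the non-trait columns
rf1 <- randomForest(X2020.Rank ~ ., data = select(df, -matches("^X201[0-9]\\.Rank$"), -LocalImage, -Image, -links, -Breed), importance = TRUE)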

Let’s take a look at the importance, evaluated using the percent increase in MSE (higher is more important).
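
For reference, here is a rough sketch of how the percent increase in MSE can be pulled out of a fitted forest and sorted. It runs on simulated data, not the breed data: the `toy` data frame, its values, and the `rf_toy` name are all made up, and the fake rank is deliberately driven by one trait.

```r
library(randomForest)

set.seed(42)

# Simulated stand-in: 60 hypothetical breeds with three trait scores from 1 to 5
toy <- data.frame(
  Drooling.Level    = sample(1:5, 60, replace = TRUE),
  Barking.Level     = sample(1:5, 60, replace = TRUE),
  Playfulness.Level = sample(1:5, 60, replace = TRUE)
)

# Made-up rank that depends mostly on drooling, so one trait carries the signal
toy$X2020.Rank <- rank(toy$Drooling.Level + rnorm(60, sd = 0.5), ties.method = "first")

rf_toy <- randomForest(X2020.Rank ~ ., data = toy, importance = TRUE)

# type = 1 selects the permutation-based measure, reported as %IncMSE for regression
imp <- importance(rf_toy, type = 1)

# Sort traits from most to least important
imp_sorted <- imp[order(imp[, "%IncMSE"], decreasing = TRUE), , drop = FALSE]
print(imp_sorted)
```

`varImpPlot(rf_toy, type = 1)` draws the same numbers as a dot chart. The measure comes from permuting one predictor at a time and checking how much the out-of-bag error grows, so values near zero or below suggest a trait adds little.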

What do these scores mean?

We can check out the trait description table to see how each of these measures was coded. I wish I could take this survey! Look at those value assignments for upper and lower bounds!

Trait | Trait_1 | Trait_5 | Description
Good With Young Children | Not Recommended | Good With Children | A breed’s level of tolerance and patience with childrens’ behavior, and overall family-friendly nature. Dogs should always be supervised around young children, or children of any age who have little exposure to dogs.
Drooling Level | Less Likely to Drool | Always Have a Towel | How drool-prone a breed tends to be. If you’re a neat freak, dogs that can leave ropes of slobber on your arm or big wet spots on your clothes may not be the right choice for you.
Coat Type | - | - | Canine coats come in many different types, depending on the breed’s purpose. Each coat type comes with different grooming needs, allergen potential, and shedding level. You may also just prefer the look or feel of certain coat types over others when choosing a family pet.
Coat Length | - | - | How long the breed’s coat is expected to be. Some long-haired breeds can be trimmed short, but this will require additional upkeep to maintain.
Openness To Strangers | Reserved | Everyone Is My Best Friend | How welcoming a breed is likely to be towards strangers. Some breeds will be reserved or cautious around all strangers, regardless of the location, while other breeds will be happy to meet a new human whenever one is around!
Playfulness Level | Only When You Want To Play | Non-Stop | How enthusiastic about play a breed is likely to be, even past the age of puppyhood. Some breeds will continue wanting to play tug-of-war or fetch well into their adult years, while others will be happy to just relax on the couch with you most of the time.
Barking Level | Only To Alert | Very Vocal | How often this breed vocalizes, whether it’s with barks or howls. While some breeds will bark at every passer-by or bird in the window, others will only bark in particular situations. Some barkless breeds can still be vocal, using other sounds to express themselves.

How did the top dog breeds score on these key traits?

We can use boxplots to compare how the top 25% of breeds in the 2020 ranking (the alphas) score on each trait against the other 75% (the betas?).
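
A minimal sketch of that split, using base R only. The `toy` data frame and its trait scores are simulated, and putting the cutoff at the 25th percentile of the 2020 rank is one reasonable reading of "top 25%" (a lower rank number means a more popular breed).

```r
set.seed(1)

# Simulated stand-in: 40 hypothetical breeds ranked 1-40 with a random trait score
toy <- data.frame(
  X2020.Rank     = 1:40,
  Drooling.Level = sample(1:5, 40, replace = TRUE)
)

# Breeds at or below the 25th percentile of rank form the top group
cutoff <- quantile(toy$X2020.Rank, 0.25)
toy$Group <- ifelse(toy$X2020.Rank <= cutoff, "Top 25%", "Other 75%")
print(table(toy$Group))

# Side-by-side boxplots of one trait for the two groups
boxplot(Drooling.Level ~ Group, data = toy, ylab = "Drooling level (1-5)")
```

With trait scores limited to whole numbers from 1 to 5, the boxes can collapse onto a single value, so the group counts or a jittered scatter help when reading the plot.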

Disclaimer: All dogs are alpha to me, except for my dog, who is the super alpha.

How do tippy top dogs score on the most important traits?

Bringing this back around to the favorite top 10

My favorite dog?

Jack, my best bud